NetNews Offline 2

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Offline 2 / NetNews Offline Volume 2.iso / news / comp / lang / c++-part1 / 5542 < prev next >

Wrap

Internet Message Format | 1996-08-05 | 6.7 KB

Path: cs.utexas.edu!not-for-mail From: wilson@cs.utexas.edu (Paul Wilson) Newsgroups: comp.lang.lisp,comp.lang.c++,comp.lang.c,comp.lang.misc,comp.theory Subject: allocator studies (was Re: GC & traditional allocators & textbooks) Date: 5 Feb 1996 09:56:51 -0600 Organization: CS Dept, University of Texas at Austin Distribution: inet Message-ID: <4f59c3$7il@jive.cs.utexas.edu> References: <rvillDL4v3n.I8r@netcom.com> <823365355snz@wildcard.demon.co.uk> <4f2ila$6p8@jive.cs.utexas.edu> <823455623snz@wildcard.demon.co.uk> NNTP-Posting-Host: jive.cs.utexas.edu In article <823455623snz@wildcard.demon.co.uk>, Cyber Surfer <cyber_surfer@wildcard.demon.co.uk> wrote: >In article <4f2ila$6p8@jive.cs.utexas.edu> > wilson@cs.utexas.edu "Paul Wilson" writes: > >> The history of allocator >> research has been a big mess---the literature is a bit of a disaster >> area---and the textbooks reflect this. The analyses in the books are >> shallow and largely wrong. (This is less attributable to the textbooks >> authors than the weak discussions of GC. It's not their fault that they >> can't summarize the gist of the literature and get it right, because >> the literature in general is disorganized, inconsistent, and often wrong.) > >I won't argue with that! ;-) I'd love to see a good summary. Well, very briefly: 1. Traditional allocator research has largely missed the point, because the true causes of fragmentation haven't been studied much. Most studies (a recent exception being Zorn, Grunwald, and Barrett's work at the U. of Colorado at Boulder, and Phong Vo's latest paper from Bell Labs) have used synthetic traces whose realism has never been validated, rather than real program behavior. 2. The standard synthetic traces of "program" behavior are unrealistic, because they're based on pseudo-random sequences, even if the object size and lifetime distributions are realistic. For good allocator policies, this causes an unrealistically high degree of fragmentation, because you're shooting holes in the heap at random. For some programs and some allocators, though, you get unrealistically low fragmentation. In general, the random trace methodology makes most allocators' performance look pretty similar, when in fact the regularities in real traces make some allocators work *much* better than others. 3. People have often focused on the speed of allocator mechanisms, at the expense of studying the effects of policy in a realistic context. As it turns out, some of the best policies are amenable to fast implementations if you just think about it as a normal data-structures-and-algorithms problem. The best policies have often been neglected because people confused policy and mechanism, and the best-known mechanisms were slow. 4. People have usually failed to separate out "true" fragmentation costs---due to the allocator's placement policy and its interactions with real program behavior---from simple overheads of simplistic mechanisms. Straightforward tweaks can reduce these costs easily. 5. The best-known policies (best fit and address-ordered first fit) work better than anyone ever knew, while some policies (Knuth's "Modified First Fit" or "Next Fit") work worse than anyone ever knew, for some programs. This is because randomized traces tend probabilistically to ensure that certain important realistic program behaviors will never happen in traditional experiments. 6. None of the analytical results in the literature (beginning with Knuth's "fifty percent rule") are valid. They're generally based on two or three assumptions that are all systematically false for most real programs. (Random allocation order, steady-state behavior, and independence of sizes and lifetimes. To make mathematical analyses tractable, people have sometimes made stronger and even more systematically false assumptions, like exponentially-distributed random lifetimes.) 7. Because of these problems, most allocator research over the last 35 years has been a big waste of time, and we still don't know much more than we knew in 1968, and almost nothing we didn't know in the mid-70's. > [...] >> "Modified First Fit" with the roving pointer is the clearest example. It >> was a bad idea, and it was quickly shown to be bad, but some other textbook >> writers keep mindlessly cribbing from Knuth, and even some implementors still >> use it. > >Is it really worse than Best Fit? I've wondered about that ever >since first read that book. You seem like a good person to ask. ;) Yes, it is significantly worse, and some versions are often far worse, even pathological. If you keep the free list in address order, it's just worse. If you keep the free list in LIFO order, it can be pathological. (LIFO order is attractive because it's cheaper than address order, and if all other things were equal, it would improve locality.) For some programs, it systematically takes little bites out of big free blocks, making those blocks unusable for allocation requests of the same big size in the future. Many traditional experiments have masked this effect by using smooth distributions of object sizes, so that odd-sized blocks are less of a problem. (Given random sizes, you're likely to have a request that's a good fit for the block size pretty soon, even after taking a bite out of it. For real traces, you're likely *not* to get any requests for comparable-but-smaller sizes anytime soon. And the randomization of order tends to eliminate the potential pathological interactions due to particular sequence behavior. You eliminate the extreme cases.) >> Obligatory positive comment: the best textbook discussion of allocators >> that I know of is Tim Standish's in _Data_Structure_Techniques_. He doesn't >> recognize the near-universal methodological problems with allocator studies, >> but he's unique in recognizing the basic data structure and algorithm issues >> in implementing allocators. > >I'll see if I can find that book. Thanks. > >> [Our] site [http://www.cs.utexas.edu/users/oops/] also has our big survey >> on memory allocators from IWMM '95, which I hope will influence future >> textbooks. It talks about empirical methodology as well as giving a >> fairly exhaustive treatment of implementation techniques. -- | Paul R. Wilson, Comp. Sci. Dept., U of Texas @ Austin (wilson@cs.utexas.edu) | Papers on memory allocators, garbage collection, memory hierarchies, | persistence and Scheme interpreters and compilers available via ftp from | ftp.cs.utexas.edu, in pub/garbage (or http://www.cs.utexas.edu/users/wilson/)